Excursion Risk
The risk and return profiles of a broad class of dynamic trading strategies,
including pairs trading and other statistical arbitrage strategies, may be
characterized in terms of excursions of the market price of a portfolio away
from a reference level. We propose a mathematical framework for the risk
analysis of such strategies, based on a description in terms of price
excursions, first in a pathwise setting, without probabilistic assumptions,
then in a Markovian setting.
We introduce the notion of delta-excursion, defined as a path which deviates
by delta from a reference level before returning to this level. We show that
every continuous path has a unique decomposition into delta-excursions, which
is useful for the scenario analysis of dynamic trading strategies, leading to
simple expressions for the number of trades, realized profit, maximum loss and
drawdown. As delta is decreased to zero, properties of this decomposition
relate to the local time of the path.
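A discrete-time analogue of the delta-excursion decomposition can be sketched as follows (a minimal illustration for sampled paths, not the paper's pathwise construction; the function and variable names are ours):

```python
def delta_excursions(path, ref, delta):
    """Split a sampled path into completed delta-excursions: segments that
    start at the reference level, deviate from it by at least `delta`, and
    end at the next return to it (a discrete stand-in for the continuous
    notion in the abstract)."""
    excursions, start, armed = [], 0, False
    for i in range(1, len(path)):
        if abs(path[i] - ref) >= delta:
            armed = True  # the current segment has deviated by delta
        # discrete "return to ref": hitting the level or crossing it
        crossed = path[i] == ref or (path[i - 1] - ref) * (path[i] - ref) < 0
        if armed and crossed:
            excursions.append(path[start:i + 1])
            start, armed = i, False
    return excursions, path[start:]  # completed excursions + incomplete tail
```

For a strategy that opens a unit position once the price deviates by delta and closes it on the return to the reference level, the number of completed excursions counts the round-trip trades and each contributes roughly delta of realized profit, which is the kind of scenario quantity the abstract describes.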
When the underlying asset follows a Markov process, we combine these results
with Itô's excursion theory to obtain a tractable decomposition of the process
as a concatenation of independent delta-excursions, whose distribution is
described in terms of Itô's excursion measure. We provide analytical results
for linear diffusions and give new examples of stochastic processes for
flexible and tractable modeling of excursions. Finally, we describe a
non-parametric scenario simulation method for generating paths whose excursion
properties match those observed in empirical data.
Comment: 36 pages; 10 figures
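In its simplest form, a non-parametric scenario simulation of this kind could resample observed excursions with replacement and concatenate them (a toy sketch under our own naming; the paper's method additionally matches excursion properties, which we do not attempt here):

```python
import random

def resample_path(excursions, n_exc, seed=0):
    """Concatenate n_exc excursions drawn with replacement from an
    empirical collection; each excursion starts and ends at the same
    reference level, so segments glue together without shifting."""
    rng = random.Random(seed)
    path = [excursions[0][0]]  # start at the reference level
    for _ in range(n_exc):
        e = rng.choice(excursions)
        path.extend(e[1:])  # drop the duplicated starting point
    return path
```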
Risk-Aware Linear Bandits: Theory and Applications in Smart Order Routing
Motivated by practical considerations in machine learning for financial
decision-making, such as risk aversion and large action spaces, we initiate the
study of risk-aware linear bandits. Specifically, we consider regret
minimization under the mean-variance measure when facing a set of actions whose
rewards can be expressed as linear functions of (initially) unknown parameters.
Driven by the variance-minimizing G-optimal design, we propose the Risk-Aware
Explore-then-Commit (RISE) algorithm and the Risk-Aware Successive Elimination
(RISE++) algorithm. Then, we rigorously analyze their regret upper bounds to
show that, by leveraging the linear structure, the algorithms can dramatically
reduce the regret when compared to existing methods. Finally, we demonstrate
the performance of the algorithms by conducting extensive numerical experiments
in a synthetic smart order routing setup. Our results show that both RISE and
RISE++ can outperform the competing methods, especially in complex
decision-making scenarios.
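A toy explore-then-commit loop for a mean-variance linear bandit might look like the following (our own minimal sketch: uniform exploration in place of the G-optimal design used by RISE, and all names are ours):

```python
import numpy as np

def etc_mean_variance(actions, pulls, rho, theta, noise_sd, seed=0):
    """Explore each action `pulls` times, estimate the unknown parameter by
    least squares, then commit to the action maximizing estimated mean
    reward minus rho times its empirical reward variance."""
    rng = np.random.default_rng(seed)
    K, _ = actions.shape
    X, y = [], []
    per_action = [[] for _ in range(K)]
    for k in range(K):
        for _ in range(pulls):  # rewards are linear in the unknown theta
            r = actions[k] @ theta + rng.normal(0.0, noise_sd[k])
            X.append(actions[k]); y.append(r); per_action[k].append(r)
    theta_hat, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    mean_hat = actions @ theta_hat
    var_hat = np.array([np.var(v) for v in per_action])
    return int(np.argmax(mean_hat - rho * var_hat)), theta_hat
```

The linear structure is what makes this sample-efficient: a single shared estimate of theta prices every arm at once, instead of learning each arm separately.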
Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
We explore reinforcement learning methods for finding the optimal policy in
the linear quadratic regulator (LQR) problem. In particular, we consider the
convergence of policy gradient methods in the setting of known and unknown
parameters. We are able to produce a global linear convergence guarantee for
this approach in the setting of finite time horizon and stochastic state
dynamics under weak assumptions. The convergence of a projected policy gradient
method is also established in order to handle problems with constraints. We
illustrate the performance of the algorithm with two examples. The first
example is the optimal liquidation of a holding in an asset. We show results
both for the case where we assume a model for the underlying dynamics and for
the case where we apply the method to the data directly. The empirical
evidence suggests that the
policy gradient method can learn the global optimal solution for a larger class
of stochastic systems containing the LQR framework and that it is more robust
with respect to model mis-specification when compared to a model-based
approach. The second example is an LQR system in a higher dimensional setting
with synthetic data.
Comment: 49 pages, 9 figures
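For intuition, here is a minimal sketch on a scalar noisy LQR with a finite horizon: gradient descent on the exact expected cost of a time-varying linear feedback (we use finite differences of the exact cost in place of a sampled policy gradient, and all parameter choices are illustrative), checked against the Riccati solution:

```python
import numpy as np

def lqr_cost(K, a, b, q, r, sigma, m0):
    """Expected cost of u_t = -K[t] x_t on x_{t+1} = a x_t + b u_t + w_t,
    w_t ~ N(0, sigma^2), with E[x_0^2] = m0 and terminal cost q E[x_T^2]."""
    m, c = m0, 0.0  # m tracks the second moment E[x_t^2]
    for k in K:
        c += (q + r * k * k) * m
        m = (a - b * k) ** 2 * m + sigma ** 2
    return c + q * m

def policy_gradient(a, b, q, r, sigma, m0, T, iters=5000, lr=0.02, eps=1e-5):
    K = np.zeros(T)  # start from the zero policy
    for _ in range(iters):
        g = np.zeros(T)
        for t in range(T):  # finite-difference gradient of the exact cost
            Kp, Km = K.copy(), K.copy()
            Kp[t] += eps; Km[t] -= eps
            g[t] = (lqr_cost(Kp, a, b, q, r, sigma, m0)
                    - lqr_cost(Km, a, b, q, r, sigma, m0)) / (2 * eps)
        K -= lr * g
    return K

def riccati_gains(a, b, q, r, T):
    """Backward Riccati recursion for the scalar finite-horizon LQR."""
    P, gains = q, []
    for _ in range(T):
        k = a * b * P / (r + b * b * P)
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
        gains.append(k)
    return np.array(gains[::-1])
```

With state noise present (sigma > 0), gradient descent from the zero policy recovers the Riccati-optimal gains in this scalar example, illustrating the kind of global convergence the abstract establishes.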
Policy gradient methods find the Nash equilibrium in N-player general-sum linear-quadratic games
We consider a general-sum N-player linear-quadratic game with stochastic dynamics over a finite horizon and prove the global convergence of the natural policy gradient method to the Nash equilibrium. To prove convergence of the method we require a certain amount of noise in the system, and we give a condition, essentially a lower bound on the covariance of the noise in terms of the model parameters, that guarantees convergence. We illustrate our results with numerical experiments, showing that even in situations where the policy gradient method may not converge in the deterministic setting, the addition of noise leads to convergence.
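As a sketch of the update being analyzed (notation ours, following the standard natural-policy-gradient form for linear-quadratic problems), each player i updates its feedback gain by

```latex
K_i \;\leftarrow\; K_i \;-\; \eta \, \nabla_{K_i} J_i(K_1,\dots,K_N)\, \Sigma_K^{-1},
\qquad
\Sigma_K \;=\; \sum_{t=0}^{T} \mathbb{E}\!\left[x_t x_t^\top\right],
```

where Sigma_K is the state covariance under the joint policy. A noise covariance bounded below keeps Sigma_K well conditioned, which is the role played by the condition described in the abstract.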